Algorithms for regression and classification: robust regression and genetic association studies
Abstract
Regression and classification are statistical techniques for extracting rules and patterns from data sets. Analyzing the algorithms involved is interdisciplinary research that poses interesting problems for statisticians and computer scientists alike. This thesis focuses on robust regression and on classification in genetic association studies. For robust regression, it presents new exact algorithms and results for robust online scale estimation with the estimators Qn and Sn, and for robust linear regression in the plane with the least quartile difference (LQD) estimator. It also devises an evolutionary computation algorithm for robust regression in higher dimensions that supports several estimators, including the widely used least median of squares (LMS) and least trimmed squares (LTS). For classification in genetic association studies, the thesis describes a Genetic Programming algorithm that outperforms the standard approaches on the data sets considered. The algorithm identifies interesting genetic factors not previously found in a data set on sporadic breast cancer, and it handles larger data sets than the methods it is compared with. It can also be extended to further application fields.
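To make the scale estimators named in the abstract concrete, here is a minimal sketch of Qn and Sn as they are usually defined in the robust statistics literature. It is a naive quadratic-time illustration, not the exact online algorithms developed in the thesis. The function names `qn_scale` and `sn_scale` and the `constant` parameter are assumptions made for this example, the default constants are the commonly cited consistency factors for Gaussian data, and finite-sample correction factors are omitted.

```python
from itertools import combinations


def qn_scale(x, constant=2.2219):
    """Naive O(n^2 log n) Qn: the k-th smallest pairwise distance, rescaled.

    k = h*(h-1)/2 with h = floor(n/2) + 1. `constant` is the usual
    Gaussian consistency factor (assumed here; no finite-sample correction).
    """
    n = len(x)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1
    k = h * (h - 1) // 2
    # All |x_i - x_j| for i < j, sorted ascending.
    diffs = sorted(abs(a - b) for a, b in combinations(x, 2))
    return constant * diffs[k - 1]


def sn_scale(x, constant=1.1926):
    """Naive O(n^2 log n) Sn: low median over i of the high median over j
    of |x_i - x_j|, rescaled by the usual Gaussian consistency factor.
    """
    n = len(x)
    if n < 2:
        raise ValueError("Sn needs at least two observations")
    inner = []
    for xi in x:
        d = sorted(abs(xi - xj) for xj in x)
        # High median: the (floor(n/2) + 1)-th order statistic.
        inner.append(d[n // 2])
    inner.sort()
    # Low median: the floor((n + 1) / 2)-th order statistic.
    return constant * inner[(n + 1) // 2 - 1]


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5]
    contaminated = [1, 2, 3, 4, 100]  # one gross outlier
    print(qn_scale(clean), qn_scale(contaminated))
    print(sn_scale(clean), sn_scale(contaminated))
```

Both functions depend only on order statistics of pairwise distances, so a single gross outlier in a small sample has a bounded effect on the result, in contrast to the standard deviation.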
Similar resources
Application of Genetic Algorithms for Pixel Selection in MIA-QSAR Studies on Anti-HIV HEPT Analogues for New Design Derivatives
Quantitative structure-activity relationship (QSAR) analysis has been carried out with a series of 107 anti-HIV HEPT compounds with antiviral activity, which was performed by chemometrics methods. Bi-dimensional images were used to calculate some pixels and multivariate image analysis was applied to QSAR modelling of the anti-HIV potential of HEPT analogues by means of multivariate calibration,...
Predicting the Type of Malaria Using Classification and Regression Decision Trees
Maryam Ashoori (School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran), Fatemeh Hamzavi (School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran). Abstract. Background: Malaria is an infectious disease infecting 200 - 300 million people annually. Environme...
Identification of Genetic Polymorphism Interactions in Sporadic Alzheimer’s Disease Using Logic Regression
Objectives: Genetic polymorphism interactions are among the important factors in affliction with complex diseases like Alzheimer’s disease. An important goal of genetic association studies is to identify a combination of polymorphisms and measure their importance in increasing the risk of occurrence of such diseases. In this study, the feature selection approach of logic regression was used to ide...
Application of Data Mining Algorithms in Discriminating Sediment Sources of the Nodeh Watershed, Gonabad
Introduction: Reduction of sediment supply requires the implementation of soil conservation and sediment control programs in the form of watershed management plans. Sediment control programs require identifying the relative importance of sediment sources, their quantitative ascription and identification of critical areas within the watersheds. Sediment source ascription involves two...
Publication date: 2009